feat(index): share IVF partition scans across batch vector queries #2
Open
sezruby wants to merge 1 commit into
Extend batch vector search (lance-format#6821) to the indexed/ANN path so a single multi-query request reads each IVF partition's storage once and scores every query that probes it, instead of re-running a full single-query plan per vector and unioning the results.

- Add `VectorIndex::search_partitions_batch` + `supports_batch_partition_search` (defaulted so non-IVF indices stay explicitly unsupported).
- Implement them for `IVFIndex` with a flat-style sub-index (IVF_FLAT/PQ/SQ/RQ): load each distinct partition once and accumulate one top-k heap per query, sharing the prefilter across the whole batch.
- Add `ANNIvfBatchExec`, which ranks every query against the centroids, runs the shared-scan batch search, and emits `query_index`-tagged results; route to it from `Scanner::batch_indexed_vector_search` when the index family supports it.
- Fall back to the existing per-query loop for HNSW, `refine_factor`, and mixed indexed/unindexed scans, so behavior never regresses.

Per-query nprobes are honored statically (no adaptive late expansion), so recall matches repeated single-query search at fixed nprobes.

Closes lance-format#6822

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Batch vector search (#6821, PR lance-format#6828) made indexed multi-query search work by looping the full single-query plan once per query vector (re-opening the index and rebuilding the prefilter each time) and unioning the results. This PR makes the indexed/ANN path share index-level state across the batch: it reads each IVF partition's storage once and scores every query that probes it.
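The shared-scan idea can be illustrated with a minimal, self-contained sketch. This is not the code in this PR and does not use Lance's API: `Hit`, `Partition`, `l2_sq`, `invert_probes`, and `batch_search` are hypothetical names, partitions are plain in-memory vectors, and squared L2 over raw `f32` vectors stands in for the real sub-index scoring. It assumes each query lists a given partition at most once.

```rust
use std::cmp::Ordering;
use std::collections::{BTreeMap, BinaryHeap};

/// One candidate neighbor (hypothetical type, not a Lance struct).
#[derive(Debug, Clone, Copy)]
struct Hit {
    dist: f32,
    row_id: u64,
}

impl Ord for Hit {
    // Order by (distance, row_id). BinaryHeap is a max-heap, so the
    // worst hit kept so far sits on top and is the one evicted.
    fn cmp(&self, other: &Self) -> Ordering {
        self.dist
            .total_cmp(&other.dist)
            .then(self.row_id.cmp(&other.row_id))
    }
}
impl PartialOrd for Hit {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Hit {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Hit {}

/// Assumption: a partition is just a list of (row_id, vector) pairs.
type Partition = Vec<(u64, Vec<f32>)>;

/// Squared L2 distance, standing in for the real distance computation.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Invert per-query probe lists into partition -> queries that probe it.
fn invert_probes(probes: &[Vec<usize>]) -> BTreeMap<usize, Vec<usize>> {
    let mut by_partition: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for (query_idx, parts) in probes.iter().enumerate() {
        for &p in parts {
            by_partition.entry(p).or_default().push(query_idx);
        }
    }
    by_partition
}

/// Visit each distinct partition once and score every query that probes it.
/// Returns the per-query top-k (best first) and the number of partition loads.
fn batch_search(
    partitions: &[Partition],
    queries: &[Vec<f32>],
    probes: &[Vec<usize>],
    k: usize,
) -> (Vec<Vec<Hit>>, usize) {
    let mut heaps: Vec<BinaryHeap<Hit>> =
        queries.iter().map(|_| BinaryHeap::new()).collect();
    let mut loads = 0;
    for (part_id, query_ids) in invert_probes(probes) {
        let rows = &partitions[part_id]; // stands in for one storage read
        loads += 1;
        for &q in &query_ids {
            for (row_id, vector) in rows {
                let dist = l2_sq(&queries[q], vector);
                heaps[q].push(Hit { dist, row_id: *row_id });
                if heaps[q].len() > k {
                    heaps[q].pop(); // drop the current worst candidate
                }
            }
        }
    }
    let results = heaps.into_iter().map(|h| h.into_sorted_vec()).collect();
    (results, loads)
}
```

With three queries that each probe two of three partitions, `batch_search` touches three partitions in total, whereas a per-query loop would perform six loads. The per-query heaps keep results independent, so each query's top-k should match what it would get when searched alone over the same probe list.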
Approach
- `VectorIndex` trait (`rust/lance-index/src/vector.rs`): add `supports_batch_partition_search()` and `search_partitions_batch(...)`, both defaulted (default returns a `not_supported` error), so non-IVF indices remain explicitly unsupported.
- `IVFIndex` (`rust/lance/src/index/vector/ivf/v2.rs`): implement the batch search for flat-style sub-indices (IVF_FLAT/PQ/SQ/RQ, i.e. `supports_global_topk_heap()`). It inverts the per-query partition lists, loads each distinct partition once, and accumulates one top-k heap per query against the loaded storage, reusing `accumulate_prepared_partition_search` / `global_heap_to_batch`. The prefilter is built once and shared across all queries.
- `ANNIvfBatchExec` (`rust/lance/src/io/exec/knn.rs`): a single exec node that ranks every query against the centroids, runs the shared-scan batch search per delta, merges per-query top-k across deltas, and emits `{query_index, _distance, _rowid}` sorted by `(query_index, _distance, _rowid)`.
- `rust/lance/src/dataset/scanner.rs`: `batch_indexed_vector_search` takes the new fast path when every segment is an IVF flat-style index and the scan is fully indexed (or `fast_search`); otherwise it falls back to the existing per-query loop (HNSW, `refine_factor`, mixed indexed/unindexed). No behavior regression.

Known limitations (documented, deferred)
- Per-query nprobes are honored statically (`minimum_nprobes`); the adaptive late-search expansion of the single-query path is not applied. Recall therefore matches repeated single-query search at fixed nprobes (the common case).
- HNSW, `refine_factor`/reranking, and batch + unindexed-fragment combine fall back to the per-query loop (follow-ups).

Test plan
- `cargo test -p lance --lib test_batch_knn`: incl. updated `test_batch_knn_indexed` (asserts the `ANNIvfBatch` plan and that batch results equal repeated single-query indexed search) and new `test_batch_knn_indexed_refine_falls_back`.
- `cargo test -p lance --lib dataset::scanner::test::test_knn` (29) and `index::vector::ivf::v2` (88): no regressions.
- `cargo fmt --all` && `cargo clippy -p lance -p lance-index --tests --benches -- -D warnings`.
- Python: `uv run pytest python/tests/test_vector_index.py -k batch` → 6 passed, incl. new `test_batch_indexed_query_matches_repeated_single_queries` (3-query + single-query). `ruff check` / `ruff format --check` clean.
- Added a batch-vs-repeated-single ANN benchmark (`benchmarks/test_search.py`). Note: the shared `datasets` benchmark fixture is currently broken in this checkout (pre-existing `use_legacy_format` deprecation → error, also fails `test_ann_no_refine`), so I validated performance with a standalone script instead:

50k rows, dim 128, IVF_PQ (64 partitions), m=32 queries, k=10, nprobes=8:
Closes lance-format#6822